Phenotypic correlation results from the combination of genetic and environmental correlations.
If both traits have low heritabilities, the phenotypic correlation is mainly due to the environmental correlation.
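The statement above can be illustrated with the usual quantitative-genetics decomposition \(r_P = h_x h_y r_A + e_x e_y r_E\), with \(e = \sqrt{1 - h^2}\). This decomposition is not written out in these notes and is assumed here; the function name and the numeric values are hypothetical and serve only as a minimal sketch.

```r
# Split the phenotypic correlation into its genetic and environmental parts.
# Assumed decomposition: r_P = h_x * h_y * r_A + e_x * e_y * r_E, e = sqrt(1 - h2)
phenotypic_cor <- function(h2_x, h2_y, r_a, r_e) {
  h_x <- sqrt(h2_x)
  h_y <- sqrt(h2_y)
  e_x <- sqrt(1 - h2_x)
  e_y <- sqrt(1 - h2_y)
  c(genetic = h_x * h_y * r_a, environmental = e_x * e_y * r_e)
}

# Same correlations, low versus high heritabilities (hypothetical values)
low  <- phenotypic_cor(h2_x = 0.1, h2_y = 0.1, r_a = 0.8, r_e = 0.3)
high <- phenotypic_cor(h2_x = 0.8, h2_y = 0.8, r_a = 0.8, r_e = 0.3)

low;  sum(low)    # the environmental part dominates
high; sum(high)   # the genetic part dominates
```

With low heritabilities the environmental term carries most of the phenotypic correlation, even though the genetic correlation is the larger of the two.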
Genetic correlation
Greatest interest for breeders
Positive or negative, and favorable or unfavorable, depending on the breeding goals. Ex 1: plant height and flowering date in barley are positively correlated. This correlation is favorable if grain harvest is the goal (short and early-maturing), but it is unfavorable for forage (tall and late-maturing) (Bernardo et al. 2002).
Causes
Linkage: genes located on the same chromosome are genetically linked and do not segregate independently, unlike genes located on different chromosomes (the resulting correlation can be dissipated by recombination over cycles of meiosis).
Pleiotropy: two traits are controlled by the same gene (the resulting correlation cannot be dissipated and is more permanent).
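To illustrate how a correlation caused by linkage can dissipate, the sketch below uses the textbook decay of linkage disequilibrium under random mating, \(D_t = D_0 (1 - c)^t\), where \(c\) is the recombination fraction between two loci. This model and all names and values below are assumptions added for illustration; they do not come from these notes.

```r
# Decay of linkage disequilibrium (D) over generations of random mating.
# Assumed model: D_t = D_0 * (1 - rec)^t, rec = recombination fraction
ld_decay <- function(D0, rec, t) D0 * (1 - rec)^t

generations <- 0:10
tight <- ld_decay(D0 = 0.25, rec = 0.01, t = generations)  # tightly linked loci
loose <- ld_decay(D0 = 0.25, rec = 0.50, t = generations)  # independently segregating loci

round(cbind(generations, tight, loose), 4)
```

Tightly linked loci keep most of their association after ten generations, whereas the association between independently segregating loci is halved every generation. A correlation caused by pleiotropy has no analogous decay.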
Response to selection
The mean genetic change in the trait of interest in a population.
Direct response to selection
The selection criterion is based on the trait(s) of interest in the target environment.
\(k_{p}\) is the standardized selection differential when the proportion of selected individuals is \(p\), and \(\sigma_{a_x}\) is the standard deviation of breeding values for trait \(x\).
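The equation for the direct response is not shown in the text above. A minimal sketch, assuming the usual form \(R_x = k_p h_x \sigma_{a_x}\) (which mirrors the expression given for \(R_y\) later in these notes) and truncation selection on a normally distributed trait, could look as follows. The function names and numeric inputs are hypothetical.

```r
# Standardized selection differential for truncation selection on a normal trait:
# k_p = dnorm(z) / p, where z is the truncation point with upper-tail area p
selection_intensity <- function(p) dnorm(qnorm(1 - p)) / p

# Assumed form of the direct response: R_x = k_p * h_x * sigma_ax
direct_response <- function(p, h2_x, sigma_ax) {
  selection_intensity(p) * sqrt(h2_x) * sigma_ax
}

k_10 <- selection_intensity(0.10)  # close to the tabulated value of 1.755
R_x  <- direct_response(p = 0.10, h2_x = 0.4, sigma_ax = 2)
k_10
R_x
```

Selecting a smaller proportion \(p\) raises \(k_p\) and therefore the expected response, as long as the other terms stay the same.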
Indirect selection
The selection criterion is based on trait(s) that may be associated with the trait(s) of interest (secondary traits). Ex: number of stalks and tons of cane per hectare. It can also be the same trait measured in different environments (Rutkoski 2019):
Fig 2. Yield of maize in a tropical environment
Fig 3. Yield of maize in a temperate environment
Indirect selection is often used because a secondary trait is easier and/or cheaper to evaluate, or can be measured in both sexes.
Correlated response to selection
Selection for one trait will cause a correlated response in the other if a genetic correlation exists between them.
Indirect response to selection
\[R_{y} = k_{p} h_{y} \sigma_{a_y}\]
The superiority of indirect over direct selection depends on the heritability of the secondary trait and on its genetic correlation with the target trait. Indirect selection can also be superior when it allows much larger population sizes than direct selection.
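The comparison can be made concrete with the standard expression for the correlated response in the target trait \(x\) when selecting on the secondary trait \(y\), \(CR_x = k_p h_y r_g \sigma_{a_x}\). This expression is not written out in these notes and is assumed here. Dividing it by the direct response \(R_x = k_p h_x \sigma_{a_x}\) gives a relative efficiency that the sketch below computes; the function name and all values are hypothetical.

```r
# Relative efficiency of indirect versus direct selection for trait x:
# CR_x / R_x = (k_indirect * h_y * r_g) / (k_direct * h_x)   (assumed formulas)
relative_efficiency <- function(h2_x, h2_y, r_g,
                                p_direct = 0.10, p_indirect = p_direct) {
  k <- function(p) dnorm(qnorm(1 - p)) / p  # standardized selection differential
  (k(p_indirect) * sqrt(h2_y) * r_g) / (k(p_direct) * sqrt(h2_x))
}

# Secondary trait much more heritable than the target trait
eff_h2 <- relative_efficiency(h2_x = 0.2, h2_y = 0.8, r_g = 0.7)

# Equal heritabilities: indirect selection is less efficient
eff_eq <- relative_efficiency(h2_x = 0.4, h2_y = 0.4, r_g = 0.7)

# Equal heritabilities, but a larger population allows selecting 1% instead of 10%
eff_n <- relative_efficiency(h2_x = 0.4, h2_y = 0.4, r_g = 0.7,
                             p_direct = 0.10, p_indirect = 0.01)
c(eff_h2, eff_eq, eff_n)
```

Values above 1 indicate that indirect selection is expected to outperform direct selection under the stated assumptions.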
Multiple trait selection
Examples of multiple-trait selection strategies breeders employ…
Tandem selection
Selection for one trait until that trait is improved, then for a second, etc., until each has been improved to the desired level. Applicable in recurrent selection programs. Ex: tropical maize populations used as breeding material in temperate regions are first selected for photoperiod insensitivity prior to selection for other traits.
# Simulate two traits (A and B) for 200 individuals
set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)
y <- rnorm(200, mean = 6, sd = 2)
dat <- data.frame(x, y)

# Scatter plot of traits A and B; individuals with trait B above 5 are shown in red
plot(x, y, pch = 19,
     ylim = range(dat$y), xlim = range(dat$x),
     # grey background, drawn after the axes are set up and before the points
     panel.first = rect(par("usr")[1], par("usr")[3],
                        par("usr")[2], par("usr")[4], col = "lightgrey"),
     cex.lab = 1.5, cex.axis = 1.8,
     xlab = "Trait A", ylab = "Trait B",
     col = ifelse(y > 5, "red", "blue"))
title(main = "SELECT FOR TRAIT B FOR SEVERAL CYCLES", cex.main = 2.5, col.main = "darkgreen")
abline(h = 5, col = "black", lwd = 2, lty = 2)  # selection threshold for trait B
# Simulate two traits (A and B) for 200 individuals
set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)
y <- rnorm(200, mean = 6, sd = 2)
dat <- data.frame(x, y)

# Scatter plot of traits A and B; individuals with trait A above 10 are shown in red
plot(x, y, pch = 19,
     ylim = range(dat$y), xlim = range(dat$x),
     # grey background, drawn after the axes are set up and before the points
     panel.first = rect(par("usr")[1], par("usr")[3],
                        par("usr")[2], par("usr")[4], col = "lightgrey"),
     cex.lab = 1.5, cex.axis = 1.8,
     xlab = "Trait A", ylab = "Trait B",
     col = ifelse(x > 10, "red", "blue"))
title(main = "SELECT FOR TRAIT A FOR SEVERAL CYCLES", cex.main = 2.5, col.main = "darkgreen")
abline(v = 10, col = "black", lwd = 2, lty = 2)  # selection threshold for trait A
Independent culling levels
A certain level of merit (minimum level of performance) is established for each trait, and all individuals below that level are discarded regardless of their values for other traits.
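# Simulate two traits (A and B) for 200 individuals
set.seed(1234)
x <- rnorm(200, mean = 10, sd = 4)
y <- rnorm(200, mean = 6, sd = 2)
dat <- data.frame(x, y)

# Scatter plot of traits A and B; individuals above both culling levels are shown in red
plot(x, y, pch = 19,
     ylim = range(dat$y), xlim = range(dat$x),
     # grey background, drawn after the axes are set up and before the points
     panel.first = rect(par("usr")[1], par("usr")[3],
                        par("usr")[2], par("usr")[4], col = "lightgrey"),
     cex.lab = 1.5, cex.axis = 1.8,
     xlab = "Trait A", ylab = "Trait B",
     col = ifelse(x > 10 & y > 5, "red", "blue"))
abline(v = 10, h = 5, col = "black", lwd = 2, lty = 2)  # culling levels for traits A and B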
Index selection
Select for \(n\) traits simultaneously by using some index of net merit:
\[I = b_{1}X_{1} + b_{2}X_{2} + \dots + b_{n}X_{n}\]
\(b_{i}\) is the weight for trait \(i\) and \(X_{i}\) is the phenotypic value for trait \(i\).
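As a minimal sketch of how such an index ranks candidates, the weighted sum can be computed for all individuals at once as a matrix product. The genotype names, phenotypic values and weights below are hypothetical and chosen only for illustration.

```r
# Hypothetical phenotypic values: rows are candidates, columns are traits
X <- matrix(c(8.2, 14.0,
              9.1, 16.5,
              7.5, 12.8,
              8.8, 13.5),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("G", 1:4), c("yield", "moisture")))

# Hypothetical index weights: reward yield, penalize moisture
b <- c(yield = 1.0, moisture = -0.2)

# I = b_1 * X_1 + b_2 * X_2 for every candidate
index <- drop(X %*% b)

# Candidates ordered from best to worst index value
ranking <- names(sort(index, decreasing = TRUE))
index
ranking
```

Note that G2 has the highest yield but does not rank first, because its high moisture is penalized by the index.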
Example adapted from Bernardo et al. (2002):
Price of a 60 kg bag of maize: R$ 90,00
Cost to dry a 60 kg bag of maize: R$ 0,25
Target moisture concentration: 13%
A new candidate maize hybrid has a yield of 100 bags (60 kg each) per hectare and a grain moisture of 15%. The profit can then be expressed in the form of the following selection index:
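The index equation itself is not present in the text above. One plausible reading, assumed here and not confirmed by the source, is that the drying cost of R$ 0,25 is charged per bag for each percentage point of moisture above the 13% target. Under that assumption the profit per hectare could be sketched as follows (all object names are hypothetical).

```r
# Values taken from the example
price_per_bag   <- 90.00  # R$ per 60 kg bag
target_moisture <- 13     # %

# ASSUMPTION: R$ 0.25 per bag and per percentage point of moisture above the target
drying_cost <- 0.25

# Profit (R$ per hectare) = revenue from grain minus drying cost
profit_index <- function(yield_bags, moisture) {
  excess <- pmax(moisture - target_moisture, 0)  # no drying cost at or below target
  price_per_bag * yield_bags - drying_cost * excess * yield_bags
}

profit_index(yield_bags = 100, moisture = 15)
```

With this reading, yield enters the index with a positive weight and moisture above the target with a negative one, so two hybrids with the same yield are ranked by their moisture.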
The BLUP (best linear unbiased prediction) methodology uses mixed models for genetic analysis and provides accurate, unbiased predictions of breeding values.
Mixed model formulation
\[
y = \mathbf{X}b + \mathbf{Z}u + e
\]
\(y\): an \((n \times 1)\) vector of observations (phenotypes)
\(b\): a \((p \times 1)\) vector of fixed effects (e.g. environment)
\(u\): a \((q \times 1)\) vector of random effects (breeding values); \(u \sim N(\mathbf{0},\mathbf{K} \sigma_{g}^2)\)
\(e\): an \((n \times 1)\) vector of random residuals; \(e \sim N(\mathbf{0},\mathbf{R} \sigma_{e}^2)\)
\(\mathbf{X}\) and \(\mathbf{Z}\): design matrices relating observations to fixed and random effects, respectively.
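To show how the pieces of this model fit together, the sketch below solves Henderson's mixed model equations in base R for a toy data set. It assumes \(\mathbf{R} = \mathbf{I}\) and treats the variance components as known, which in practice they are not (software such as sommer estimates them). The phenotypes, the relationship matrix and the variances are hypothetical.

```r
# Toy data: 3 genotypes, 2 observations each (hypothetical values)
y <- c(10.2, 9.8, 11.5, 12.1, 8.9, 9.4)
n <- length(y)
q <- 3

X <- matrix(1, n, 1)                                   # fixed effect: overall mean
Z <- model.matrix(~ factor(rep(1:q, each = 2)) - 1)    # links observations to genotypes

# Hypothetical relationship matrix among the 3 genotypes
K <- matrix(c(1.00, 0.50, 0.25,
              0.50, 1.00, 0.50,
              0.25, 0.50, 1.00), nrow = q, byrow = TRUE)

# ASSUMPTION: variance components are known and R = I
sigma2_g <- 2
sigma2_e <- 1
lambda   <- sigma2_e / sigma2_g

# Mixed model equations:
# [X'X  X'Z                 ] [b]   [X'y]
# [Z'X  Z'Z + K^-1 * lambda ] [u] = [Z'y]
LHS <- rbind(cbind(crossprod(X),    crossprod(X, Z)),
             cbind(crossprod(Z, X), crossprod(Z) + solve(K) * lambda))
RHS <- rbind(crossprod(X, y), crossprod(Z, y))

sol   <- solve(LHS, RHS)
b_hat <- sol[1]    # estimate of the fixed effect (BLUE)
u_hat <- sol[-1]   # predicted breeding values (BLUP)
b_hat
u_hat
```

The predicted breeding values are shrunk towards zero relative to the raw genotype deviations, and related genotypes borrow information from each other through \(\mathbf{K}\).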
Uni-trait model
\[
y = \mathbf{X}b + \mathbf{Z}u + e
\]\[\begin{align}
\begin{bmatrix}
u \\
e \\
\end{bmatrix} \sim MVN
\begin{pmatrix}
\begin{bmatrix}
0 \\
0 \\
\end{bmatrix}
,
\begin{bmatrix}
\mathbf{K}\sigma_{g}^2 & 0 \\
0 & \mathbf{R}\sigma_{e}^2 \\
\end{bmatrix}
\end{pmatrix}
\end{align}\]
The prediction accuracy of traits that have low heritability or are difficult and/or expensive to measure can be increased by using multi-trait models when the correlation between traits is at least moderate. These models can be used to improve the target trait or all the correlated traits simultaneously (Calus and Veerkamp 2011).
Multi-trait models can be useful for increasing prediction accuracy when the trait of interest is not measured in the individuals of the testing set, but this and other traits were observed in the individuals of the training set (Pszczola et al. 2013; Jia and Jannink 2012).
Installing packages
# GitHub version
install.packages('devtools'); library(devtools); install_github('covaruber/sommer')
# or
# CRAN version
install.packages('sommer', dependencies = TRUE)
library(sommer)
Hands-on using R
# rrBLUP (single-trait and no marker information used)
# GBLUP (single-trait with marker information used)
#######################################################################################
# Cross-validation Single-trait models (rrBLUP) using sommer

# Datasets
data(DT_cpdata)

# Phenotypic data
DT <- DT_cpdata
head(DT)
# Removing rows with NAs because they prevent computing the correlation in cross-validation
DT <- na.omit(DT)

# Marker data
GT <- GT_cpdata
GT[1:5, 1:5]

# Chromosome positions
#MP <- MP_cpdata
#head(MP_cpdata)

# Genomic relationship matrix, restricted to the individuals kept in DT
# (assumes the row names of GT, and hence of A, are the ids stored in DT$id)
A <- A.mat(GT)
ids <- as.character(DT$id)
A <- A[ids, ids]
A[1:5, 1:5]

# Creating new dataset
data.1.2 <- DT

# Scaling phenotypes (mean 0, standard deviation 1)
data.1.2$Yield     <- as.vector(scale(data.1.2$Yield, center = TRUE, scale = TRUE))      # yield
data.1.2$FruitAver <- as.vector(scale(data.1.2$FruitAver, center = TRUE, scale = TRUE))  # fruit average
data.1.2$Firmness  <- as.vector(scale(data.1.2$Firmness, center = TRUE, scale = TRUE))   # firmness
data.1.2$color     <- as.vector(scale(data.1.2$color, center = TRUE, scale = TRUE))      # color

#######################################################################################
# Cross-validation Single-trait models (GBLUP) using sommer

# Number of replicates (set to 3 here; use e.g. 30 for more stable estimates)
rep <- 3
# 5 random partitions, each holding out 20% of the individuals as testing set
fold <- 5
n <- dim(data.1.2)[1]
Replicates <- data.frame(Replicate = 1:rep, MSEP = NA, Cor = NA)

for (j in 1:rep) {
  set.seed(j)
  # Each column of Partitions holds the row indices of one random testing set
  Partitions <- replicate(fold, sample(n, 0.20 * n, replace = FALSE))
  Table <- data.frame(Partitions = 1:fold, MSEP = NA, Cor = NA)

  for (i in 1:fold) {
    tst <- Partitions[, i]
    # Mask the phenotypes of the testing set
    data.1.2$y_NA <- data.1.2$Yield
    data.1.2$y_NA[tst] <- NA

    # M1
    fit <- mmer(y_NA ~ 1,
                random = ~ vsr(id, Gu = A) + vsr(Rowf),
                rcov = ~ units,
                data = data.1.2, verbose = FALSE)

    # BLUPs (assumed to be returned in the order of A, i.e. the row order of data.1.2)
    bv <- fit$U$`u:id`$y_NA
    # Prediction of testing set
    yp_tst <- bv[tst]

    # MSEP and Cor
    Table$MSEP[i] <- mean((data.1.2$Yield[tst] - yp_tst)^2)
    Table$Cor[i]  <- cor(data.1.2$Yield[tst], yp_tst)
  }

  Replicates$MSEP[j] <- mean(na.omit(Table$MSEP))  # mean of folds
  Replicates$Cor[j]  <- mean(na.omit(Table$Cor))   # mean of folds
}

mean(Replicates$MSEP)  # mean of replicates
mean(Replicates$Cor)   # mean of replicates

#######################################################################################
# rrBLUP (multi-trait and no marker information used)
# GBLUP (multi-trait with marker information used)
References
Bernardo, Rex et al. 2002. Breeding for Quantitative Traits in Plants. Vol. 1. Woodbury, MN: Stemma Press.
Furbank, Robert T, Jose A Jimenez-Berni, Barbara George-Jaeggli, Andries B Potgieter, and David M Deery. 2019. “Field Crop Phenomics: Enabling Breeding for Radiation Use Efficiency and Biomass in Cereal Crops.” New Phytologist 223 (4): 1714–27. https://nph.onlinelibrary.wiley.com/doi/full/10.1111/nph.15817.
Montesinos López, Osval Antonio, Abelardo Montesinos López, and José Crossa. 2022. Multivariate Statistical Machine Learning Methods for Genomic Prediction. Springer Nature. https://link.springer.com/book/10.1007/978-3-030-89010-0.
Rutkoski, Jessica E. 2019. “A Practical Guide to Genetic Gain.” Advances in Agronomy 157: 217–49.